Fault Tolerant Clustering Revisited
In discrete k-center and k-median clustering, we are given a set of points P
in a metric space M, and the task is to output a set C \subseteq P, |C| = k,
such that the cost of clustering P using C is as small as possible. For
k-center, the cost is the furthest a point has to travel to its nearest center,
whereas for k-median, the cost is the sum of all point to nearest center
distances. In the fault-tolerant versions of these problems, we are given an
additional parameter 1 \leq \ell \leq k, such that when computing the cost
of clustering, points are assigned to their \ell-th nearest-neighbor in C,
instead of their nearest neighbor. We provide constant factor approximation
algorithms for these problems that are both conceptually simple and highly
practical from an implementation standpoint.
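To make the objective concrete, here is a minimal brute-force sketch of the fault-tolerant cost being optimized; it is not the paper's approximation algorithm, just a direct evaluation of the cost of a given center set C, where each point is charged its distance to its \ell-th nearest center:

```python
import math

def fault_tolerant_cost(points, centers, ell, objective="k-center"):
    """Cost of clustering `points` with `centers`, where each point is
    assigned to its ell-th nearest center (ell = 1 recovers the usual
    k-center / k-median objectives)."""
    costs = []
    for p in points:
        dists = sorted(math.dist(p, c) for c in centers)
        costs.append(dists[ell - 1])  # distance to the ell-th nearest center
    return max(costs) if objective == "k-center" else sum(costs)

P = [(0, 0), (1, 0), (10, 0), (11, 0)]
C = [(0, 0), (10, 0)]
fault_tolerant_cost(P, C, ell=1)  # ordinary k-center cost
fault_tolerant_cost(P, C, ell=2)  # every point must reach its 2nd-nearest center
```

With \ell = 2 the cost jumps, since each point must be served even after its nearest center fails; the paper's algorithms pick C so that this quantity is within a constant factor of optimal.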
Approximate Nearest Neighbor Search for Low Dimensional Queries
We study the Approximate Nearest Neighbor problem for metric spaces where the
query points are constrained to lie on a subspace of low doubling dimension,
while the data is high-dimensional. We show that this problem can be solved
efficiently despite the high dimensionality of the data.
Down the Rabbit Hole: Robust Proximity Search and Density Estimation in Sublinear Space
For a set of n points in \Re^d, and parameters k and \eps, we present
a data structure that answers (1+\eps,k)-\ANN queries in logarithmic time.
Surprisingly, the space used by the data structure is \Otilde(n/k); that
is, the space used is sublinear in the input size if k is sufficiently large.
Our approach provides a novel way to summarize geometric data, such that
meaningful proximity queries on the data can be carried out using this sketch.
Using this, we provide a sublinear space data-structure that can estimate the
density of a point set under various measures, including:
(i) the sum of distances of the k closest points to the query point, and
(ii) the sum of squared distances of the k closest points to the query point.
Our approach generalizes to other distance based estimation of densities of
similar flavor. We also study the problem of approximating some of these
quantities when using sampling. In particular, we show that a sample of size
\Otilde (n /k) is sufficient, in some restricted cases, to estimate the above
quantities. Remarkably, the sample size depends only linearly on the
dimension.
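For reference, the two density measures above are straightforward to state exactly; the following brute-force sketch computes them directly (the point of the paper is that a sublinear-space sketch can approximate them):

```python
import math

def density_estimates(points, q, k):
    """Exact (brute-force) versions of the two density measures:
    (i) sum of distances of the k closest points to q, and
    (ii) sum of squared distances of the k closest points to q."""
    dists = sorted(math.dist(p, q) for p in points)[:k]
    return sum(dists), sum(d * d for d in dists)

P = [(0, 0), (3, 4), (6, 8), (0, 10)]
s1, s2 = density_estimates(P, (0, 0), k=2)
```

The brute-force version uses O(n) time and space per query; the data structure in the paper trades exactness for \Otilde(n/k) space and logarithmic query time.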
Robust Proximity Search for Balls using Sublinear Space
Given a set of n disjoint balls b_1, \ldots, b_n in \Re^d, we provide a data
structure, of near linear size, that can answer (1 \pm \epsilon)-approximate
kth-nearest neighbor queries in O(log n + 1/\epsilon^d) time, where k and
\epsilon are provided at query time. If k and \epsilon are provided in advance,
we provide a data structure to answer such queries, that requires (roughly)
O(n/k) space; that is, the data structure has sublinear space requirement if k
is sufficiently large.
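The quantity being approximated can be stated exactly with a short brute-force reference implementation; here the distance from a query point to a ball (center c, radius r) is taken as max(0, ||q - c|| - r), which is the standard point-to-ball distance, though the paper's precise convention should be checked there:

```python
import math

def kth_nearest_ball_dist(balls, q, k):
    """Brute-force k-th nearest ball distance: each ball is a
    (center, radius) pair, and the distance from q to a ball is
    max(0, ||q - c|| - r).  The paper's data structure answers a
    (1 +- eps)-approximation of this in O(log n + 1/eps^d) time."""
    dists = sorted(max(0.0, math.dist(q, c) - r) for c, r in balls)
    return dists[k - 1]

balls = [((0, 0), 1.0), ((5, 0), 1.0), ((10, 0), 1.0)]
kth_nearest_ball_dist(balls, (0, 0), k=2)
```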
Purity based continuity bounds for quantum information measures
In quantum information theory, communication capacities are mostly given in
terms of entropic formulas. Continuity of such entropic quantities is
significant, as it guarantees robustness against perturbations of quantum
states. Traditionally, continuity bounds have been
provided in terms of the trace distance, which is a bona fide metric on the set
of quantum states. In the present contribution we derive continuity bounds for
various information measures based on the difference in purity of the concerned
quantum states. In a finite-dimensional system, we establish continuity bounds
for von Neumann entropy which depend only on purity distance and dimension of
the system. We then obtain uniform continuity bounds for conditional von
Neumann entropy in terms of purity distance which is free of the dimension of
the conditioning subsystem. Furthermore, we derive the uniform continuity
bounds for other entropic quantities like relative entropy distance, quantum
mutual information and quantum conditional mutual information. As an
application, we investigate the variation in squashed entanglement with respect
to purity. We also obtain a bound on the quantum conditional mutual information
of a quantum state that is arbitrarily close to a quantum Markov chain.
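The two basic quantities the abstract relates can be computed in a few lines; this sketch uses the standard definitions S(\rho) = -Tr(\rho \log_2 \rho) for the von Neumann entropy and Tr(\rho^2) for purity (the paper's exact notion of "purity distance" is defined there, not here):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]  # 0 log 0 = 0 by convention
    return float(-np.sum(ev * np.log2(ev)))

def purity(rho):
    """Tr(rho^2); equals 1 iff rho is pure, and 1/d for the maximally
    mixed state on a d-dimensional system."""
    return float(np.trace(rho @ rho).real)

d = 2
rho_pure = np.diag([1.0, 0.0])   # pure state: entropy 0, purity 1
rho_mixed = np.eye(d) / d        # maximally mixed: entropy log2(d), purity 1/d
```

A purity-based continuity bound of the kind described then controls |S(\rho) - S(\sigma)| in terms of how far apart Tr(\rho^2) and Tr(\sigma^2) are, rather than in terms of the trace distance.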
Quantum conditional entropies and steerability of states with maximally mixed marginals
Quantum steering is an asymmetric correlation which occupies a place between
entanglement and Bell nonlocality. In the paradigmatic scenario involving the
protagonists Alice and Bob, the entangled state shared between them is said to
be steerable from Alice to Bob if the steering assemblage on Bob's side does
not admit a local hidden state (LHS) description. Quantum conditional
entropies, on the other hand, provide another characterization of quantum
correlations.
Contrary to common intuition, conditional entropies for some entangled
states can be negative, marking a significant departure from the classical
realm. Quantum steering and quantum nonlocality in general, share an intricate
relation with quantum conditional entropies. In the present contribution, we
investigate this relationship. For a significant class, namely the two-qubit
Weyl states, we show that negativity of conditional R\'enyi 2-entropy and
conditional Tsallis 2-entropy is a necessary and sufficient condition for the
violation of a suitably chosen three settings steering inequality. With respect
to the same inequality, we find an upper bound for the conditional R\'enyi
2-entropy, such that the general two-qubit state is steerable. Moving from a
particular steering inequality to local hidden state descriptions, we show that
some two-qubit Weyl states which admit an LHS model possess non-negative
conditional R\'enyi 2-entropy. However, the same does not hold true for some
non-Weyl states. Our study further investigates the relation between
non-negativity of conditional entropy and LHS models in two-qudits for the
isotropic and Werner states. There we find that whenever these states admit an
LHS model, they possess a non-negative conditional R\'enyi 2-entropy. We then
observe that the same holds true for a noisy variant of the two-qudit Werner
state.
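The sign condition at the heart of this abstract is easy to check numerically. The sketch below uses the common (non-sandwiched) definition S_2(A|B) = S_2(AB) - S_2(B) with S_2(\rho) = -\log_2 Tr(\rho^2); the paper may use a different variant of the conditional R\'enyi 2-entropy, so this is an illustration under that assumption, evaluated on the two-qubit Werner state (a member of the Weyl class):

```python
import numpy as np

def renyi2(rho):
    """Renyi 2-entropy: S_2(rho) = -log2 Tr(rho^2)."""
    return float(-np.log2(np.trace(rho @ rho).real))

def cond_renyi2(rho_ab, dA, dB):
    """Conditional Renyi 2-entropy S_2(A|B) = S_2(AB) - S_2(B)
    (non-sandwiched definition; an assumption, see lead-in)."""
    # trace out subsystem A: rho_B[b, d] = sum_a rho[a, b, a, d]
    rho_b = np.einsum("abad->bd", rho_ab.reshape(dA, dB, dA, dB))
    return renyi2(rho_ab) - renyi2(rho_b)

def werner(p):
    """Two-qubit Werner state: p |psi-><psi-| + (1 - p) I/4."""
    psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2.0)  # singlet
    return p * np.outer(psi, psi) + (1.0 - p) * np.eye(4) / 4.0
```

With this parameterization, cond_renyi2(werner(p), 2, 2) = -\log_2((3p^2 + 1)/2), which is negative exactly when p > 1/\sqrt{3}: the conditional entropy switches sign as the state becomes more entangled.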
Approximating Minimization Diagrams and Generalized Proximity Search
We investigate the classes of functions whose minimization diagrams can be
approximated efficiently in \Re^d. We present a general framework and a
data-structure that can be used to approximate the minimization diagram of such
functions. The resulting data-structure has near linear size and can answer
queries in logarithmic time. Applications include approximating the Voronoi
diagram of (additively or multiplicatively) weighted points. Our technique also
works for more general distance functions, such as metrics induced by convex
bodies, and the nearest furthest-neighbor distance to a set of point sets.
Interestingly, our framework works also for distance functions that do not
comply with the triangle inequality. For many of these functions, no
near-linear size approximation was known before.
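A minimization diagram is simply the pointwise minimum of a family of functions; for intuition, here is a brute-force query against the minimization diagram of additively weighted distance functions (the additively weighted Voronoi diagram mentioned above) — the paper's contribution is answering such queries approximately in logarithmic time, not this linear scan:

```python
import math

def weighted_nn(sites, q):
    """Evaluate the minimization diagram of the functions
    f_i(q) = ||q - p_i|| + w_i at the query point q, i.e. return the
    site whose weighted-Voronoi cell contains q."""
    return min(sites, key=lambda s: math.dist(s[0], q) + s[1])

sites = [((0, 0), 0.0), ((4, 0), -3.0)]  # (point, additive weight) pairs
weighted_nn(sites, (1, 0))  # the weighted site wins despite being farther
```

Multiplicative weights, convex distance functions, or nearest furthest-neighbor distances just swap in a different key function; the framework in the paper covers all of these uniformly.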
Space Exploration via Proximity Search
We investigate what computational tasks can be performed on a point set in
\Re^d, if we are only given black-box access to it via nearest-neighbor
search. This is a reasonable assumption if the underlying point set is either
provided implicitly, or it is stored in a data structure that can answer such
queries. In particular, we show the following: (A) One can compute an
approximate bi-criteria k-center clustering of the point set, and more
generally compute a greedy permutation of the point set. (B) One can decide if
a query point is (approximately) inside the convex-hull of the point set.
We also investigate the problem of clustering the given point set, such that
meaningful proximity queries can be carried out on the centers of the clusters,
instead of the whole point set.
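The greedy permutation mentioned in (A) is the farthest-first traversal: each successive point is the one farthest from everything chosen so far. The paper's point is that this can be approximated with only nearest-neighbor oracle access; the sketch below computes it directly with full access, for illustration:

```python
import math

def greedy_permutation(points):
    """Farthest-first traversal: starting from points[0], repeatedly
    append the point maximizing the distance to its nearest already
    chosen point.  Prefixes of this ordering give 2-approximate
    k-center solutions for every k simultaneously."""
    perm = [points[0]]
    remaining = list(points[1:])
    while remaining:
        far = max(remaining,
                  key=lambda p: min(math.dist(p, c) for c in perm))
        perm.append(far)
        remaining.remove(far)
    return perm

greedy_permutation([(0, 0), (1, 0), (10, 0)])
```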
Efficient Algorithms for k-Regret Minimizing Sets
A regret minimizing set Q is a small size representation of a much larger database P so that user queries executed on Q return answers whose scores are not much worse than those on the full dataset. In particular, a k-regret minimizing set has the property that the regret ratio between the score of the top-1 item in Q and the score of the top-k item in P is minimized, where the score of an item is the inner product of the item's attributes with a user's weight (preference) vector. The problem is challenging because we want to find a single representative set Q whose regret ratio is small with respect to all possible user weight vectors.
We show that k-regret minimization is NP-complete for all dimensions d>=3, settling an open problem from Chester et al. [VLDB 2014]. Our main algorithmic contributions are two approximation algorithms, both with provable guarantees, one based on coresets and another based on hitting sets. We perform extensive experimental evaluation of our algorithms, using both real-world and synthetic data, and compare their performance against the solution proposed in [VLDB 2014]. The results show that our algorithms are significantly faster and scale to much larger sets than the greedy algorithm of Chester et al., for answers of comparable quality.
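The regret ratio is concrete enough to sketch directly. The code below evaluates it for the k = 1 case and estimates the worst case by sampling random weight vectors; the paper's coreset and hitting-set algorithms instead bound the regret over *all* weight vectors, so this is only an illustrative check, not their method:

```python
import numpy as np

def regret_ratio(P, Q, w):
    """Regret ratio for one weight vector w (k = 1 case): the relative
    gap between the best score over the full set P and the best score
    over the representative subset Q (score = inner product with w)."""
    best_p = max(float(np.dot(p, w)) for p in P)
    best_q = max(float(np.dot(q, w)) for q in Q)
    return (best_p - best_q) / best_p

def sampled_max_regret(P, Q, num_samples=1000, d=2, seed=0):
    """Estimate the worst-case regret by sampling random nonnegative
    weight vectors (a heuristic stand-in for the exact worst case)."""
    rng = np.random.default_rng(seed)
    return max(regret_ratio(P, Q, w)
               for w in rng.random((num_samples, d)))

P = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
Q = [(0.7, 0.7)]
```

For a user who cares only about the first attribute (w = (1, 0)), the subset Q above loses 30% of the achievable score, while for balanced weights it loses nothing; the minimization problem asks for the Q that keeps the worst such loss small.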